Goto

Collaborating Authors

 adversarial program


AdversarialReprogrammingRevisited

Neural Information Processing Systems

Adversarial reprogramming, introduced by Elsayed, Goodfellow, and SohlDickstein, seeks to repurpose a neural network to perform a different task, by manipulating its input without modifying its weights.



Why Adversarial Reprogramming Works, When It Fails, and How to Tell the Difference

arXiv.org Artificial Intelligence

Adversarial reprogramming allows repurposing a machine-learning model to perform a different task. For example, a model trained to recognize animals can be reprogrammed to recognize digits by embedding an adversarial program in the digit images provided as input. Recent work has shown that adversarial reprogramming may not only be used to abuse machine-learning models provided as a service, but also beneficially, to improve transfer learning when training data is scarce. However, the factors affecting its success are still largely unexplained. In this work, we develop a first-order linear model of adversarial reprogramming to show that its success inherently depends on the size of the average input gradient, which grows when input gradients are more aligned, and when inputs have higher dimensionality. The results of our experimental analysis, involving fourteen distinct reprogramming tasks, show that the above factors are correlated with the success and the failure of adversarial reprogramming.


Adversarial Reprogramming Revisited

arXiv.org Artificial Intelligence

Adversarial reprogramming, introduced by Elsayed, Goodfellow, and Sohl-Dickstein, seeks to repurpose a neural network to perform a different task, by manipulating its input without modifying its weights. We prove that two-layer ReLU neural networks with random weights can be adversarially reprogrammed to achieve arbitrarily high accuracy on Bernoulli data models over hypercube vertices, provided the network width is no greater than its input dimension. We also substantially strengthen a recent result of Phuong and Lampert on directional convergence of gradient flow, and obtain as a corollary that training two-layer ReLU neural networks on orthogonally separable datasets can cause their adversarial reprogramming to fail. We support these theoretical results by experiments that demonstrate that, as long as batch normalisation layers are suitably initialised, even untrained networks with random weights are susceptible to adversarial reprogramming. This is in contrast to observations in several recent works that suggested that adversarial reprogramming is not possible for untrained networks to any degree of reliability.


Cross-modal Adversarial Reprogramming

arXiv.org Artificial Intelligence

With the abundance of large-scale deep learning models, it has become possible to repurpose pre-trained networks for new tasks. Recent works on adversarial reprogramming have shown that it is possible to repurpose neural networks for alternate tasks without modifying the network architecture or parameters. However these works only consider original and target tasks within the same data domain. In this work, we broaden the scope of adversarial reprogramming beyond the data modality of the original task. We analyze the feasibility of adversarially repurposing image classification neural networks for Natural Language Processing (NLP) and other sequence classification tasks. We design an efficient adversarial program that maps a sequence of discrete tokens into an image which can be classified to the desired class by an image classification model. We demonstrate that by using highly efficient adversarial programs, we can reprogram image classifiers to achieve competitive performance on a variety of text and sequence classification benchmarks without retraining the network.


4 Types Of Privacy Attacks Every Machine Learning Startup Should Watch Out For - Analytics India Magazine

#artificialintelligence

With the advent of APIs that offer state-of-the-art services a click away, setting up a machine learning shop has become more accessible. But with rapid democratisation, there is a risk of non-ML players who have jumped the gun, finding themselves in a flurry of privacy attacks, never been heard of before. In a first of its kind survey carried out on ML privacy by a team from Czech Technical University, the researchers address the different ways an ML application can be vulnerable. In privacy-related attacks, wrote the researchers, an adversary's goal is related to gaining knowledge, not intended to be shared, such as knowledge about the training data or information about the model, or even extracting information about properties of the data. Black-box attacks are those attacks where the adversary does not know the model parameters, architecture or training data.


How to trick deep learning algorithms into doing new things

#artificialintelligence

This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence. Two things often mentioned with deep learning are "data" and "compute resources." You need a lot of both when developing, training, and testing deep learning models. When developers don't have a lot of training samples or access to very powerful servers, they use transfer learning to finetune a pre-trained deep learning model for a new task. At this year's ICML conference, scientists at IBM Research and Taiwan's National Tsing Hua University Research introduced "black-box adversarial reprogramming" (BAR), an alternative repurposing technique that turns a supposed weakness of deep neural networks into a strength.


Adversarial Reprogramming: Exploring A New Paradigm of Neural Network Vulnerabilities

#artificialintelligence

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake. An adversarial attacker could target autonomous vehicles by using stickers or paint to create an adversarial stop sign that the vehicle would interpret as a'yield' or other sign. A confused car on a busy day is a potential catastrophe packed in a 2000 pound metal box. So far, the majority of adversarial attacks, the attacker designed few perturbations to produce an output specific to a given input. The attacks consisted of untargeted attacks that aim to degrade the performance of a model.


Adversarial Reprogramming of Sequence Classification Neural Networks

arXiv.org Machine Learning

Adversarial Reprogramming has demonstrated success in utilizing pre-trained neural network classifiers for alternative classification tasks without modification to the original network. An adversary in such an attack scenario trains an additive contribution to the inputs to repurpose the neural network for the new classification task. While this reprogramming approach works for neural networks with a continuous input space such as that of images, it is not directly applicable to neural networks trained for tasks such as text classification, where the input space is discrete. Repurposing such classification networks would require the attacker to learn an adversarial program that maps inputs from one discrete space to the other. In this work, we introduce a context-based vocabulary remapping model to reprogram neural networks trained on a specific sequence classification task, for a new sequence classification task desired by the adversary. We propose training procedures for this adversarial program in both white-box and black-box settings. We demonstrate the application of our model by adversarially repurposing various text-classification models including LSTM, bi-directional LSTM and CNN for alternate classification tasks.


Adversarial Reprogramming of Neural Networks

arXiv.org Machine Learning

Deep neural networks are susceptible to adversarial attacks. In computer vision, well-crafted perturbations to images can cause neural networks to make mistakes such as identifying a panda as a gibbon or confusing a cat with a computer. Previous adversarial examples have been designed to degrade performance of models or cause machine learning models to produce specific outputs chosen ahead of time by the attacker. We introduce adversarial attacks that instead reprogram the target model to perform a task chosen by the attacker---without the attacker needing to specify or compute the desired output for each test-time input. This attack is accomplished by optimizing for a single adversarial perturbation, of unrestricted magnitude, that can be added to all test-time inputs to a machine learning model in order to cause the model to perform a task chosen by the adversary when processing these inputs---even if the model was not trained to do this task. These perturbations can be thus considered a program for the new task. We demonstrate adversarial reprogramming on six ImageNet classification models, repurposing these models to perform a counting task, as well as two classification tasks: classification of MNIST and CIFAR-10 examples presented within the input to the ImageNet model.